A clustering tool for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Models
Abstract
We propose a new procedure for clustering nucleotide sequences based on Laplacian Eigenmaps and Gaussian mixture modelling. We apply it to a set of 100 DNA sequences from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene of a collection of Platyhelminthes and Nematoda species. The resulting clusters are consistent with the gene's phylogenetic tree, computed by maximum likelihood, and with the NCBI taxonomy. We also developed a Python package implementing this procedure, which is available online.
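The two-stage idea in the abstract (a Laplacian Eigenmaps embedding followed by a Gaussian mixture fit) can be sketched as below. This is not the authors' package: the k-mer frequency features, the Gaussian-kernel affinity with a median bandwidth, the function names `kmer_profile` and `cluster_sequences`, and the synthetic AT-rich and GC-rich sequences are all assumptions made for illustration. The sketch relies on scikit-learn's `SpectralEmbedding` and `GaussianMixture`.

```python
from itertools import product

import numpy as np
from sklearn.manifold import SpectralEmbedding
from sklearn.mixture import GaussianMixture


def kmer_profile(seq, k=3):
    """Return the normalised k-mer frequency vector of a DNA sequence.

    Assumption: k-mer profiles stand in for whatever sequence
    representation or distance the original procedure uses.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def cluster_sequences(seqs, n_clusters, n_components=2, k=3, seed=0):
    """Embed sequences with Laplacian Eigenmaps, then cluster with a GMM."""
    X = np.array([kmer_profile(s, k) for s in seqs])
    # Pairwise squared Euclidean distances between k-mer profiles.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Gaussian-kernel affinity; the median bandwidth is an assumed heuristic.
    bandwidth = np.median(sq_dist[sq_dist > 0])
    affinity = np.exp(-sq_dist / bandwidth)
    # Laplacian Eigenmaps on the precomputed affinity graph.
    embedding = SpectralEmbedding(
        n_components=n_components, affinity="precomputed", random_state=seed
    ).fit_transform(affinity)
    # Gaussian mixture model fitted in the embedded space.
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    return gmm.fit_predict(embedding)


# Toy data (hypothetical): 10 AT-rich and 10 GC-rich random sequences.
rng = np.random.default_rng(0)
bases = list("ACGT")
at_rich = ["".join(rng.choice(bases, size=200, p=[0.4, 0.1, 0.1, 0.4]))
           for _ in range(10)]
gc_rich = ["".join(rng.choice(bases, size=200, p=[0.1, 0.4, 0.4, 0.1]))
           for _ in range(10)]

# Two groups need only one embedding coordinate to be separated.
labels = cluster_sequences(at_rich + gc_rich, n_clusters=2, n_components=1)
print(labels)
```

On real data the number of clusters and the embedding dimension would have to be chosen, for instance by comparing mixture models with an information criterion such as BIC; that choice is left open in this sketch.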
Similar resources
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian Mixture Model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the DFT coefficients of the noise are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Magnetic eigenmaps for community detection in directed networks
Communities in directed networks have often been characterized as regions with a high density of links, or as sets of nodes with certain patterns of connection. Our approach for community detection combines the optimization of a quality function and a spectral clustering of a deformation of the combinatorial Laplacian, the so-called magnetic Laplacian. The eigenfunctions of the magnetic Laplaci...
The Laplacian Eigenmaps Latent Variable Model
We introduce the Laplacian Eigenmaps Latent Variable Model (LELVM), a probabilistic method for nonlinear dimensionality reduction that combines the advantages of spectral methods—global optimisation and ability to learn convoluted manifolds of high intrinsic dimensionality—with those of latent variable models—dimensionality reduction and reconstruction mappings and a density model. We derive LE...
Image Segmentation using Gaussian Mixture Model
Abstract: Stochastic models such as mixture models, graphical models, Markov random fields and hidden Markov models play a key role in probabilistic data analysis. In this paper, we applied a Gaussian mixture model to the pixels of an image. The parameters of the model were estimated by the EM algorithm. In addition, each pixel of the true image was labelled using Bayes' rule. In fact,...
Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation
In the traditional object recognition pipeline, descriptors are densely sampled over an image, pooled into a high dimensional non-linear representation and then passed to a classifier. In recent years, Fisher Vectors have proven empirically to be the leading representation for a large variety of applications. The Fisher Vector is typically taken as the gradients of the log-likelihood of descrip...